Using Memory-Style Storage to Support Fault Tolerance in Data Centers

Authors

  • Xiao Liu
  • Qing Yi
  • Jishen Zhao
Abstract

Next-generation nonvolatile memories combine the byte addressability and high performance of memory with the nonvolatility of disk/flash. They promise emerging memory-style storage (MSS) systems that are directly attached to the memory bus, offering fast load/store access and data persistence in a single level of storage. MSS can be especially attractive in data centers, where fault tolerance support through storage systems is critical to performance and energy. Yet existing fault tolerance mechanisms, such as logging and checkpointing, are designed for slow block-level storage interfaces; their design choices are not wholly suitable for MSS. The goal of this work is to explore efficient fault tolerance techniques that exploit the fast memory interface and the nature of single-level storage. Our preliminary exploration shows that, by reducing data duplication and increasing application parallelism, such techniques can substantially improve system performance and reduce energy consumption.

Similar resources

Rack-Aware Regenerating Codes for Data Centers

Erasure coding is widely used for massive storage in data centers to achieve high fault tolerance and low storage redundancy. Since the cross-rack communication cost is often high, it is critical to design erasure codes that minimize the cross-rack repair bandwidth during failure repair. In this paper, we analyze the optimal trade-off between storage redundancy and cross-rack repair bandwidth s...

Caribou: Intelligent Distributed Storage

The ever-increasing amount of data being handled in data centers causes an intrinsic inefficiency: moving data around is expensive in terms of bandwidth, latency, and power consumption, especially given the low computational complexity of many database operations. In this paper we explore near-data processing in database engines, i.e., the option of offloading part of the computation directly t...

Partial Replication for Software Transactional Memory Systems

Nowadays, transactional in-memory distributed storage systems are widely used as a means to increase the performance of applications that need to frequently access large amounts of shared data. In this context, data replication has two main advantages: it supports load balancing and fault tolerance. However, these advantages need to be weighed against the costs of replication: namely memory con...

Partial Replication on Transactional Memory Systems

Nowadays, transactional in-memory distributed storage systems are widely used as a means to increase the performance of applications that need to frequently access large amounts of shared data. In this context, data replication has two main advantages: it supports load balancing and fault tolerance. However, these advantages need to be weighed against the costs of replication: namely memory con...

Adaptive Checkpointing

Checkpointing is a typical approach to tolerate failures in today’s supercomputing clusters and computational grids. Checkpoint data can be saved either in central stable storage, or in processor memory (as in diskless checkpointing), or local disk space (replacing memory with local disk in diskless checkpointing). But where to save the checkpoint data has a great impact on the performance of a...

Publication date: 2016